Stability and Generalization of Decentralized Stochastic Gradient Descent


Abstract

The stability and generalization of stochastic gradient-based methods provide valuable insights into the algorithmic performance of machine learning models. As the main workhorse for deep learning, stochastic gradient descent has received a considerable amount of study. Nevertheless, the community has paid little attention to its decentralized variants. In this paper, we provide a novel formulation of decentralized stochastic gradient descent. Leveraging this formulation together with (non)convex optimization theory, we establish the first stability and generalization guarantees for decentralized stochastic gradient descent. Our theoretical results are built on top of a few common and mild assumptions and reveal, for the first time, that decentralization deteriorates the stability of SGD. We verify our theoretical findings by using a variety of decentralized settings and benchmark machine learning models.


Similar resources

Asynchronous Decentralized Parallel Stochastic Gradient Descent

Recent work shows that decentralized parallel stochastic gradient descent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technique for improving the efficiency of parallelism in distributed machine learning platforms and has been widely used in many popular machine learning software packages and solvers based on centrali...


Robust Decentralized Differentially Private Stochastic Gradient Descent

Stochastic gradient descent (SGD) is one of the most applied machine learning algorithms in unreliable large-scale decentralized environments. In this type of environment data privacy is a fundamental concern. The most popular way to investigate this topic is based on the framework of differential privacy. However, many important implementation details and the performance of differentially priv...


Stability and optimality in stochastic gradient descent

Iterative procedures for parameter estimation based on stochastic gradient descent allow the estimation to scale to massive data sets. However, in both theory and practice, they suffer from numerical instability. Moreover, they are statistically inefficient as estimators of the true parameter value. To address these two issues, we propose a new iterative procedure termed AISGD. For statistical ...


Online Learning, Stability, and Stochastic Gradient Descent

In batch learning, stability together with existence and uniqueness of the solution corresponds to well-posedness of Empirical Risk Minimization (ERM) methods; recently, it was proved that CVloo stability is necessary and sufficient for generalization and consistency of ERM ([9]). In this note, we introduce CVon stability, which plays a similar role in online learning. We show that stochastic g...


Data-Dependent Stability of Stochastic Gradient Descent

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD) and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD, which depend on the worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i11.17173